-
- ~4Dgifts/toolbox/public/libdmalloc README
-
-
- Libdmalloc is a debugging malloc library I wrote with ideas I stole from
- Conor P. Cahill's dbmalloc, Brandyn Webb's malloc-debug, Kipp Hickman's
- "leaky", Purify, and sgi's old dbx source. It is not a finished product.
- Currently the only environment in which it works is IRIX5 and maybe IRIX4
- (the only problem with IRIX4 should be benign complaints regarding getwd).
- I believe it is MP-safe.
-
- It's very useful for finding memory corruption quickly in many different
- programs, since it has little overhead and does not require programs to
- be compiled in any special way. It relies on the ability to plug in
- a different malloc library using DSOs.
-
- Please send all questions and comments to me, Don Hatch (hatch@sgi.com).
-
- ===============================================================================
-
- WHAT LIBDMALLOC DOES
-
- libdmalloc.a contains the following standard functions:
- malloc
- free
- realloc
- calloc
- cfree
- These are actually wrappers for the functions from libmalloc.a,
- which are also included in libdmalloc, but with disguised names
- mAlLoC, fReE, etc.
- The wrapper functions maintain malloc statistics, and do the following
- other good stuff:
- -- Initialize newly malloced memory to 1's, to break
- programs that depend on it being 0's
- -- Fill freed memory with 2's, to break programs that
- look at freed memory.
- -- free() and realloc() do sanity checks to make sure
- the area immediately surrounding the memory has not
- been modified (8 bytes or so on either side).
- If corruption is detected, it attempts to print a current
- stacktrace and also a stacktrace of the original malloc if possible.
- -- during exit(), all malloced memory is checked for corruption.
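The checks listed above can be illustrated with a small sketch. This is
not libdmalloc's actual code: the block layout, the 0xAA guard value, and
the names sketch_malloc, sketch_check and sketch_free are all made up for
illustration. Only the 1's and 2's fill values and the "8 bytes or so" of
guard area come from this README.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout (an assumption, not libdmalloc's real one):
 *   [size, padded to 8 bytes][8 guard bytes][user data][8 guard bytes]
 */
enum { GUARD = 8, HEADER = 16 };   /* HEADER = size field + front guard */
#define GUARD_BYTE 0xAA            /* made-up guard value */
#define FILL_NEW   0x01            /* fresh memory is filled with 1's */
#define FILL_FREED 0x02            /* freed memory is filled with 2's */

/* Allocate n bytes surrounded by guard areas and filled with 1's.
 * (Size overflow of HEADER + n + GUARD is ignored in this sketch.) */
void *sketch_malloc(size_t n) {
    unsigned char *raw = malloc(HEADER + n + GUARD);
    if (raw == NULL) return NULL;
    memcpy(raw, &n, sizeof n);                        /* remember the size */
    memset(raw + HEADER - GUARD, GUARD_BYTE, GUARD);  /* front guard */
    memset(raw + HEADER, FILL_NEW, n);                /* user area */
    memset(raw + HEADER + n, GUARD_BYTE, GUARD);      /* rear guard */
    return raw + HEADER;
}

/* Return 0 if both guards are intact, -1 on underflow, 1 on overflow. */
int sketch_check(const void *p) {
    const unsigned char *user = p;
    const unsigned char *raw = user - HEADER;
    size_t n, i;
    memcpy(&n, raw, sizeof n);
    for (i = 0; i < GUARD; i++)
        if (raw[HEADER - GUARD + i] != GUARD_BYTE) return -1;
    for (i = 0; i < GUARD; i++)
        if (user[n + i] != GUARD_BYTE) return 1;
    return 0;
}

/* Check the guards, fill the user area with 2's, then release the block.
 * Returns the result of the guard check. */
int sketch_free(void *p) {
    unsigned char *user = p;
    size_t n;
    int status = sketch_check(p);
    memcpy(&n, user - HEADER, sizeof n);
    memset(user, FILL_FREED, n);
    free(user - HEADER);
    return status;
}
```

Writing one byte past the end of a block returned by sketch_malloc()
makes sketch_check() return 1; writing one byte before it returns -1.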
-
- ===============================================================================
-
- LINKING A PROGRAM WITH LIBDMALLOC
-
- There are two ways to link libdmalloc into your program.
- The first is simply to tell rld to do it at runtime:
- setenv _RLD_LIST /usr/tmp/libdmalloc.so:DEFAULT
- Then any non-setuid dynamic executable you run (use the 'file' program
- to determine whether an executable is dynamic or not) will use libdmalloc.
- [[ Hint: if you want to run dbx on a stripped executable, dbx will
- work better (i.e. be able to make interactive calls) if your
- _RLD_LIST also contains crt1.so. You can make one of these as follows:
- ld -shared /usr/lib/crt1.o -o /usr/tmp/crt1.so
- setenv _RLD_LIST /usr/tmp/libdmalloc.so:/usr/tmp/crt1.so:DEFAULT ]]
-
- The above runtime-linking method will give the full memory-corruption
- detection power of libdmalloc, but the stack tracing routines
- will not work (and may core dump). So if you want to
- use libdmalloc to detect leaks, or if you want libdmalloc
- to give you a stack trace of the original call to malloc
- when corruption is detected, you will have to link the program
- with libdmalloc.a and its dependent libraries.
- The arguments to give the linker are:
- libdmalloc.a -lmld -lexc -lmpc -lmangle
- Try to link with .a's rather than .so's for the rest
- of the libraries wherever possible; the stack tracing routines
- really are not robust inside DSOs.
-
- ===============================================================================
-
- MEMORY CORRUPTION DETECTION
-
- CORE DUMPS THAT DON'T OCCUR WITH THE REGULAR MALLOC
- If this happens, it probably means the program is using uninitialized
- or freed memory; libdmalloc intentionally fills such regions
- with a fill pattern to break such programs.
-
- If you are running a debugger on a program linked with libdmalloc
- and you see the value "^A^A^A^A^A..." or 0x1010101 or 16843009
- it usually means you are looking at an uninitialized malloced area;
- if you see "^B^B^B^B^B..." or 0x2020202 or 33686018,
- it usually means you are looking at an area that has been
- freed already.
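The decimal and hex values quoted above are simply what a 32-bit word
looks like when every byte holds 1 or 2. A small sketch to confirm the
arithmetic (fill_word and classify_word are hypothetical helper names,
not part of libdmalloc):

```c
#include <stdint.h>
#include <string.h>

/* Build the 32-bit word you get when memory is filled with one byte value.
 * Every byte is the same, so the result does not depend on byte order. */
uint32_t fill_word(unsigned char byte) {
    uint32_t w;
    memset(&w, byte, sizeof w);
    return w;
}

/* Hypothetical helper: guess what a word seen in the debugger means. */
const char *classify_word(uint32_t w) {
    if (w == fill_word(0x01)) return "uninitialized";  /* 0x01010101 */
    if (w == fill_word(0x02)) return "freed";          /* 0x02020202 */
    return "other";
}
```

fill_word(1) is 0x01010101 (16843009) and fill_word(2) is 0x02020202
(33686018), matching the values given above.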
-
- UNDERFLOWS AND OVERFLOWS
- When libdmalloc reports "overflow" corruption at a malloced address,
- it means data has been illegally written past the end of the array.
- Likewise "underflow" corruption means data has been illegally written
- in the region immediately preceding the array.
- libdmalloc checks at least the 8 bytes immediately
- preceding and following the malloced array for
- underflow and overflow corruption when the array is
- free()d or realloc()ed, and all malloced arrays are checked
- during exit(). When corruption is detected, error messages get
- sent to stderr.
-
- Error messages for overflows and underflows look something like this:
- % oawk /./ /dev/null
- oawk(24863): ERROR: underflow corruption detected during free
- at malloc block 0x100d8fb0
-
- If you have compiled the program with libdmalloc.a
- and the offending mallocs and frees aren't called
- from within a DSO, you may be able to get libdmalloc to give you
- stack traces of the free() and the original malloc():
- % unsetenv _RLD_LIST
- % setenv MALLOC_STACKTRACE_GET_DEPTH 10
- % oawk /./ /dev/null
- oawk(24879): ERROR: underflow corruption detected during free
- at malloc block 0x100d8fb0
- 0 free() [dmalloc.c:892, 0x418000]
- 1 freetr() [b.c:99, 0x40ccd0]
- 2 freetr() [b.c:103, 0x40cd20]
- 3 freetr() [b.c:108, 0x40cd5c]
- 4 freetr() [b.c:103, 0x40cd20]
- 5 makedfa() [b.c:65, 0x40cae8]
- 6 yyparse() [awk.g.y:181, 0x40ac6c]
- 7 main() [main.c:132, 0x40e984]
- 8 __start() [crt1text.s:133, 0x409bb0]
- This block may have been allocated here:
- 0 add() [b.c:284, 0x40d598]
- 1 cfoll() [b.c:167, 0x40d07c]
- 2 cfoll() [b.c:173, 0x40d0d0]
- 3 cfoll() [b.c:177, 0x40d0ec]
- 4 cfoll() [b.c:173, 0x40d0d0]
- 5 makedfa() [b.c:62, 0x40cab8]
- 6 yyparse() [awk.g.y:181, 0x40ac6c]
- 7 main() [main.c:132, 0x40e984]
- 8 __start() [crt1text.s:133, 0x409bb0]
-
- If not (e.g. if unshared libraries are not available,
- or you don't want to recompile) you can probably
- still track down the original malloc in the debugger.
-
- -- To stop the program when corruption
- is detected, set a breakpoint in malloc_bad().
-
- -- To stop the program when malloc is about to fail (return 0),
- set a breakpoint in malloc_failed().
-
- -- To stop the program when malloc is about to return
- a particular address (say 0x100d8fb0, as in the above example)
- setenv MALLOC_BLOCK_OF_INTEREST 0x100d8fb0
- and set a breakpoint in malloc_of_interest().
-
- -- If while running a program in the debugger
- you want to know whether a given malloced array has overflowed
- or underflowed yet, call malloc_isgoodblock(addr).
- (See the above hint about using crt1.so to enable dbx to make
- interactive calls if you are debugging a stripped program).
- To check all malloced blocks at once, call malloc_check();
- the returned value will be 0 for success,
- or -1 (with error messages to stderr) if corruption is detected.
- malloc_check() is called automatically during exit().
-
- [[ Hint: if you are using the _RLD_LIST variable and
- the main program is stripped, dbx probably won't be able to find
- symbols in libdmalloc.so until the program is actually running.
- One way to get there is to set a breakpoint in getenv, run
- the program until it stops in getenv (dbx will seg fault, but
- don't worry, it's okay :-)), and then delete the breakpoint;
- then the desired symbols should be visible. ]]
-
- SUPPRESSING ERROR MESSAGES
- Certain known malloc overflow bugs always appear at a constant
- address in a program; the MALLOC_SUPPRESS environment variable
- can be used to suppress error messages about such bugs.
- For example, if you are tired of seeing the following messages:
- strings(24675): ERROR: overflow corruption detected during exit
- at malloc block 0x100010d0 (4 bytes)
- CC(24678): ERROR: overflow corruption detected during exit
- at malloc block 0x1001b9d8 (5 bytes)
- you can do the following:
- setenv MALLOC_SUPPRESS " \
- CC:0x1001b9d8 \
- /bin/CC:0x1001b9d8 \
- /usr/bin/CC:0x1001b9d8 \
- strings:0x100010d0 \
- /bin/strings:0x100010d0 \
- "
- Note that there is a separate entry for each likely value of argv[0].
- Note also that libdmalloc may not be able to determine argv[0]
- if main() has overwritten it, so suppression may not always work.
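To make the matching rule concrete, here is a rough sketch of how such a
list could be searched. It assumes the entries are separated by
whitespace and have the form argv0:address; the function name
is_suppressed is made up, and libdmalloc's real parser may differ.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Return 1 if the whitespace-separated list contains the entry
 * "prog:addr", 0 otherwise.  The address may be decimal or 0x... hex. */
int is_suppressed(const char *list, const char *prog, unsigned long addr) {
    size_t proglen = strlen(prog);
    const char *p = list;

    while (*p != '\0') {
        const char *end;

        while (isspace((unsigned char)*p)) p++;      /* skip separators */
        if (*p == '\0') break;

        end = p;                                     /* entry is [p, end) */
        while (*end != '\0' && !isspace((unsigned char)*end)) end++;

        /* The entry must be exactly prog, then ':', then an address. */
        if ((size_t)(end - p) > proglen + 1 &&
            strncmp(p, prog, proglen) == 0 && p[proglen] == ':') {
            char *stop;
            unsigned long a = strtoul(p + proglen + 1, &stop, 0);
            if (stop == end && a == addr) return 1;
        }
        p = end;
    }
    return 0;
}
```

Because the program name is compared as an exact string, "CC" and
"/bin/CC" are different entries, which is why the example above lists
each likely value of argv[0] separately.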
-
- ===============================================================================
-
- LEAK DETECTION
- [[ XXX This section is slightly out-of-date; I haven't tried
- this stuff in a while, and there may be better ways to do things now.
- In particular, the MALLOC_INFO_ATEXIT environment variable
- is probably a good way to look for leaks during exit().
- CaseVision probably does a better job of all this anyway. ]]
-
- libdmalloc also contains the following additional functions and
- variables, defined in "dmalloc.h":
- void malloc_reset();
- Sets all counts to zero.
- void malloc_info(int nonleaks_too, int stacktrace_print_depth);
- Prints out stacktraces of all leaks that have occurred
- since the last call to malloc_reset() (actually
- prints a histogram indexed by stack trace).
- If nonleaks_too is nonzero, the stacktraces of all
- mallocs and frees (not just leaks) will be printed.
- The stacktrace_print_depth argument specifies how
- much of each stacktrace you want to see; -1 means
- the entire trace (which is limited by the
- malloc_stacktrace_get_depth variable, see below).
- If the environment variable MALLOC_INFO_ATEXIT
- is set, then malloc_info(0,-1) will be called during exit().
- void malloc_info_cleanup();
- Frees all resources opened by malloc_info().
- malloc_info() reads the symbol table of the executable
- object file (and all shared objects on which the executable
- depends) the first time it needs to look up
- a function name, file name and line number from a pc address;
- this is very time-consuming, so it leaves these
- files and symbol tables open.
- malloc_info_cleanup() attempts to close them and free the space.
- NOTE: as of this writing, libmld leaks like a sieve
- (see incident #176726) so malloc_info_cleanup may not
- be very effective.
- void malloc_failed();
- This no-op function gets called whenever malloc fails
- for any reason. It exists solely for breakpoint debugging.
- int malloc_stacktrace_get_depth;
- This integer controls how deep a stacktrace to get and store
- at each malloc and free.
- Its initial value is 0, or the
- value of the environment variable MALLOC_STACKTRACE_GET_DEPTH
- if set. Applications may change this value
- at any time. There is a hard-coded max depth of 10
- (to change this, recompile this library with
- -DMAX_STACKTRACE_DEPTH=20 or whatever).
- Values less than 0 or greater than the max depth
- are taken to mean the max depth.
- int malloc_fillarea;
- If nonzero, then newly malloced memory will be
- initialized to 1's and newly freed memory will
- be filled with 2's.
- The initial value of this variable is nonzero.
- I recommend not changing it.
- void malloc_init_function();
- This is a no-op function that gets called at the
- beginning of the very first call to malloc().
- It is designed to be overridden by applications
- if they wish to do things before the first malloc
- (putting code at the top of main() is not sufficient,
- since malloc can get called by pre-main initialization
- routines).
- For example, libmallocGadget's version of this function
- looks for the environment variables
- MALLOC_GADGET_STACKTRACE_GET_DEPTH and MALLOC_GADGET_M_KEEP
- and sets malloc_stacktrace_get_depth and mallopt()
- appropriately. Note that this overrides the environment variable
- MALLOC_STACKTRACE_GET_DEPTH (i.e. MALLOC_STACKTRACE_GET_DEPTH
- has no effect when using the malloc gadget).
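The rule for out-of-range values of malloc_stacktrace_get_depth can be
written down in a few lines. This is only a sketch of the rule as
described above; effective_depth is a made-up name, and the real library
may apply the limit differently.

```c
/* Default hard-coded maximum described above; the library can be
 * recompiled with a different value. */
#ifndef MAX_STACKTRACE_DEPTH
#define MAX_STACKTRACE_DEPTH 10
#endif

/* Map a requested depth to the depth actually used: values below 0 or
 * above the maximum are taken to mean the maximum. */
int effective_depth(int requested) {
    if (requested < 0 || requested > MAX_STACKTRACE_DEPTH)
        return MAX_STACKTRACE_DEPTH;
    return requested;
}
```

So assigning -1 to malloc_stacktrace_get_depth is a convenient way of
asking for the deepest trace the library was built to store.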
-
- ===============================================================================
-
- EXAMPLE OF FINDING LEAKS USING LIBDMALLOC AND DBX:
-
- Compile the program using the libraries specified above.
- % dbx leaky_program
- (dbx) when in main { assign malloc_stacktrace_get_depth = -1 }
- (make sure stacktraces are gathered
- on each malloc and free)
- (dbx) stop at 10 (break prior to suspected leak)
- (dbx) stop at 20 (break after suspected leak)
- (dbx) run
- Process 2916 (leaky_program) stopped at leaky_program.c:10
- (dbx) ccall malloc_info(0,0) (make sure symbol table is loaded
- before reset, so it doesn't interfere
- with the statistics)
- (dbx) ccall malloc_reset() (reset all counts to 0)
- (dbx) cont
- Process 2916 (leaky_program) stopped at leaky_program.c:20
- (dbx) ccall malloc_info(0,0) (print leaks since last reset)
- (dbx) ccall malloc_info(0,-1) (print leaks since last reset,
- with full stacktraces)
- (dbx) ccall malloc_info(1,0) (print all mallocs & frees since last
- reset)
- (dbx) ccall malloc_info(1,3) (print all mallocs & frees since last
- reset, with stacktraces shown to
- depth of 3)
-
- ===============================================================================
-
- ENVIRONMENT VARIABLES LIBDMALLOC LOOKS AT
-
- MALLOC_BLOCK_OF_INTEREST (default=0)
- If set to an address, libdmalloc will call the no-op
- function malloc_of_interest() during all mallocs which
- return the given address.
- This may be useful for finding the original
- malloc of a corrupt block in a deterministic program;
- simply setenv MALLOC_BLOCK_OF_INTEREST to be the address
- given in the corruption error message, and use the debugger
- to stop in malloc_of_interest().
- MALLOC_CALL_OF_INTEREST (default=-1)
- If set to a non-negative integer n, the n'th call
- to malloc will call the no-op function malloc_of_interest().
- This can be used if you wish to stop in the n'th call to malloc()
- in the debugger.
- MALLOC_CHECK_ATEXIT (default=1)
- If this variable is nonzero (as it is by default),
- all malloced blocks will be checked for overflow and
- underflow corruption during exit().
- If set to a value >= 2, a message will be printed to stderr
- during this check, something like:
- ls(24659): Checking malloc chain at exit...done.
- This will keep you informed of which of the programs you are
- running are actually using libdmalloc.
- Keep in mind, however, that some programs
- (e.g. p_finalize and inst) will abort if they detect
- any stderr output from a child process.
- MALLOC_FILLAREA (default=1)
- If nonzero (as it is by default), malloc() and realloc()
- will fill all uninitialized
- bytes with the value of this variable, and free() and realloc()
- will fill all free'd bytes with 2's.
- MALLOC_INFO_ATEXIT (default=0)
- If this variable is set, malloc_info(0,-1) will
- be called during exit().
- MALLOC_PROMPT_ON_STARTUP
- If set, during the first call to malloc(),
- the following message will be sent to /dev/tty:
- <commandname>(<process-id>): hit return to continue
- and one character will be read from /dev/tty before continuing.
- This may be useful for identifying "mystery programs" that
- are being run from other programs.
- MALLOC_STACKTRACE_GET_DEPTH (default=0)
- Initializes the value of the malloc_stacktrace_get_depth
- variable, which controls the depth of the stack trace
- to store during each malloc. See "malloc_stacktrace_get_depth"
- above for more info.
- MALLOC_SUPPRESS
- Use this variable to suppress messages coming
- from a particular program at a particular address.
- See "SUPPRESSING ERROR MESSAGES" above for details.
- _MALLOC_TRY_TO_PRINT_STACKTRACES
- If set, an attempt will be made to print stacktraces on error,
- even if _RLD_LIST is set. (Usually stack traces are not
- attempted if _RLD_LIST is set, since _RLD_LIST usually
- means libdmalloc.so is being used, which means stack tracing
- will probably core dump).
- _MALLOC_DONT_TRY_TO_PRINT_STACKTRACES
- If set, no attempt will be made to print stacktraces on error,
- even if _RLD_LIST is not set. (See _MALLOC_TRY_TO_PRINT_STACKTRACES.)
- _STACKTRACE_ARGV0
- If set, the function stacktrace_get_argv0() will
- return the value of this variable, rather than trying
- to get the value of argv[0] from the stack.
- This may be useful for programs whose main() overwrites
- argv.
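Several of the variables above are numeric with a documented default
(for example MALLOC_CALL_OF_INTEREST defaults to -1 and
MALLOC_CHECK_ATEXIT to 1). A minimal sketch of reading such a setting
follows. The helper names parse_setting and env_setting are made up, and
treating unparsable text as "use the default" is an assumption; this
README does not say what libdmalloc does in that case.

```c
#include <stdlib.h>

/* Parse a numeric setting.  NULL, empty, or non-numeric text yields the
 * default.  Base 0 accepts decimal as well as 0x... hex. */
long parse_setting(const char *value, long dflt) {
    char *stop;
    long n;

    if (value == NULL || *value == '\0') return dflt;
    n = strtol(value, &stop, 0);
    if (stop == value) return dflt;    /* no digits were consumed */
    return n;
}

/* Look up an environment variable and parse it with a default. */
long env_setting(const char *name, long dflt) {
    return parse_setting(getenv(name), dflt);
}
```

For example, env_setting("MALLOC_CHECK_ATEXIT", 1) gives 1 when the
variable is unset and 2 after "setenv MALLOC_CHECK_ATEXIT 2". Addresses
too large for a long would need strtoul instead.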
-